Application/Control Number: 10/777,072
Art Unit: 2626 

DETAILED ACTION 

This is the initial response to the application filed February 13, 2004. Claims 1-14
are pending and are considered below.

Claim Rejections - 35 USC § 102 

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public 
use or on sale in this country, more than one year prior to the date of application for patent in the United 
States. 

Claims 1-5, 9-12 and 14 are rejected under 35 U.S.C. 102(b) as being anticipated by Johnson ("A
Semantic Lexicon for Medical Language Processing", JAMIA 1999).

1. As per claim 1, Johnson discloses an apparatus for recognizing a biological
named entity from biological literature based on united medical language system 
(UMLS), comprising: 

A resource construction unit for receiving metathesaurus from the UMLS and
constructing a concept name database, a single name database and a category
keyterm database, which are language resources to be used to recognize a named
entity (page 211, Lexical Matching: each lexeme in the specialist lexicon is matched to
terms in the metathesaurus. Once a match has been made, the lexeme (single
name), semantic type (concept name) and all derivational and inflectional variants
(category keyterm) are obtained and added to the semantic lexicon (database));

A rule collection unit for receiving each concept name stored in the concept 
name database, extracting features of each of the concept names by using data stored 
in the single name database and the category keyterm database, and constructing a 
rule database by creating a rule used to recognize the named entity and filtering the rule 
by using the extracted features (pages 211-213, if one member of a pair of semantic
types (concept name) is preferred for lexical items, including variants, (single names 
and keyterms) assigned to that pair, then a preference rule is determined. The rule is 
then assigned to each lexeme and variant in the semantic lexicon). 

A named entity recognition unit for receiving a biological literature, extracting 
nouns and noun phrases that are candidate named entities, applying the rules stored in 
the rule database to the nouns and the noun phrases, and recognizing the named 
entities (page 211, Corpus Matching, Contiguous word sequences were extracted from
a corpus of discharge summaries and matched against the semantic lexicon). 

2. As per claim 2, Johnson discloses the apparatus of claim 1, wherein the
resource construction unit extracts concept names from the metathesaurus of the
UMLS, which is divided according to the semantic categories, to construct the concept
name database, processes the concept name stored in the concept name database to
extract single names and category keyterms, and constructs the single name database
and the category keyterm database by using the extracted single names and category
keyterms (page 211, Lexical Matching: each lexeme in the specialist lexicon is
matched to terms in the metathesaurus. Once a match has been made, the lexeme
(single name), semantic type (concept name) and all derivational and inflectional
variants (category keyterm) are obtained and added to the semantic lexicon (database)).

3. As per claim 3, Johnson discloses the apparatus of claim 1, wherein the rule
collection unit extracts the feature of a token constituting each of the concept names
stored in the concept name database, creates the rules by combining the extracted
features, weights the rules, filters the weighted rules with a threshold, and stores the
filtered rules in the rule database (pages 211-213, lexemes (single names) and their
inflectional variants (keyterms) are determined for a pair of semantic types (concept 
names) by comparing the semantic lexicon to the metathesaurus. Then discharge 
summaries are examined to determine which semantic type is used more frequently 
(weights). The semantic type that is used more frequently is preferred (filtered), and 
thus assigned to the lexeme and inflectional variants in the semantic lexicon). 

4. As per claim 4, Johnson discloses the apparatus of claim 1, wherein the named
entity recognition unit extracts the candidate named entities from the literature provided
through a literature input unit, extracts the feature of each of the tokens constituting the
candidate named entity, creates a rule used to determine the candidate named entity by
combining the extracted features, compares the created rule with the rules stored in the
rule database to extract an existing rule suitable for the candidate named entity, applies
a weight value of each of the extracted rules and a heuristic used to determine a
category of the named entity, determines a final semantic category for the candidate
named entity, and recognizes the named entity (page 211, Corpus Matching, and page
216, Results, Contiguous word sequences were extracted from a corpus of discharge 
summaries and matched against the semantic lexicon. The semantic lexicon contains 
lexemes and semantic types combined to make a rule; therefore a rule must have been 
extracted from the literature in order to compare it to the semantic lexicon. In addition, 
preference rules (weight) are applied to the corpus to determine the semantic type). 

5. As per claim 5, Johnson discloses a method for recognizing a biological named 
entity from biological literature based on UMLS, the method comprising the steps of: 

(a) receiving metathesaurus from the UMLS, extracting concept names, single
names and category keyterms, which are language resources to be used to recognize
named entities, and constructing a concept name database, a single name database
and a category keyterm database (page 211, Lexical Matching: each lexeme in the
specialist lexicon is matched to terms in the metathesaurus. Once a match has been
made, the lexeme (single name), semantic type (concept name) and all derivational and
inflectional variants (category keyterm) are obtained and added to the semantic lexicon
(database));

(b) extracting features of the concept name by using the language resources
stored in each of the databases, constituting a rule for the extracted features, and storing the
constituted rule in a rule database (pages 211-213, if one member of a pair of semantic
types (concept name) is preferred for lexical items, including variants, (single names 
and keyterms) assigned to that pair, then a preference rule is determined. The rule is 
then assigned to each lexeme and variant in the semantic lexicon); and 

(c) receiving a literature, extracting features of a candidate named entity, creating 
a rule used to determine the candidate named entity by combining the extracted 
features, comparing the created rule with the rules stored in the rule database, and 
determining a final semantic category by using a result of comparison (page 211, 
Corpus Matching, Contiguous word sequences were extracted from a corpus of 
discharge summaries and matched against rules in the semantic lexicon). 

6. As per claim 9, Johnson discloses the method of claim 5, wherein the step (b) 
comprises the steps of: (b-1) extracting the features from each of the concept names 
stored in the concept name database according to a token, and (b-2) constituting the 
rule by combining the tokens whose features are extracted, calculating weight value of 
the constituted rule, filtering the rules with their weight values, and storing the filtered 
rules in the rule database (pages 211-213, lexemes (single names) and their inflectional
variants (keyterms) are determined for a pair of semantic types (concept names). Then 
discharge summaries are examined to determine which semantic type is used more 
frequently (weights). The semantic type that is used more frequently is preferred
(filtered), and thus assigned to the lexeme and inflectional variants in the semantic 
lexicon). 

7. As per claim 10, Johnson discloses the method of claim 9, wherein in the step 
(b-1), the feature of the tokens of each of the concept names stored in the concept
name database is extracted using the features of the category keyterm, the single name 
and a capital letter expression, an alphanumeric, a special character, a preposition or 
conjunction, which are features defined to reflect characteristics of the biological named 
entity, and a subtype of each of the features (pages 211-213, lexemes (single names)
and their inflectional variants (keyterms) are determined for a pair of semantic types 
(concept names) by matching the semantic lexicon to the metathesaurus. Each lexeme 
and its variant is matched using first word or letter uppercase, numbers in brackets, an
NOS (not otherwise specified) character, and the first preposition in the head noun).

8. As per claim 11, Johnson discloses the method of claim 9, wherein the step (b-2)
comprises the steps of: receiving the result in which the concept name is tokenized
and the features are extracted at the step (b-1), and creating as many rules as the
number of combinations of subtypes according to the subtypes of the features of the
token; and calculating appearance distribution of the rule in each category on all the
created rules, filtering the rules with the threshold, and constructing the rule database
(pages 211-213, lexemes (single names) and their inflectional variants (keyterms) are
determined for a pair of semantic types (concept names). Then discharge summaries 
are examined to determine which semantic type is used more frequently (appearance 
distribution). The semantic type that is used more frequently is preferred (filtering), and 
thus assigned to the lexeme and inflectional variants in the semantic lexicon). 

9. As per claim 12, Johnson discloses the method of claim 5, wherein the step (c) 
comprises the steps of: (c-1) extracting nouns and noun phrases, which are candidate
named entities, from the inputted literature; (c-2) extracting features of each token of a 
candidate named entity; (c-3) combining the features extracted from each of the tokens 
of the candidate named entity, and creating the rule used to determine the candidate 
named entity; (c-4) comparing the created rule with the rules stored in the rule 
database; and (c-5) determining the final semantic category of the candidate named 
entity (page 211, Corpus Matching, and page 216, Results, Contiguous word 
sequences were extracted from a corpus of discharge summaries and matched against 
the semantic lexicon. The semantic lexicon contains lexemes and semantic types 
combined to make a rule; therefore a rule must have been extracted from the literature 
in order to compare it to the semantic lexicon. In addition, preference rules (weight) are 
applied to the corpus to determine the semantic type). 

10. As per claim 14, Johnson discloses the method of claim 12, wherein in the step
(c-5), the final semantic category of the candidate named entity is determined using 
weight values of existing rules extracted at the step (c-4) and a heuristic used to 
determine a category of the named entity, and outputted as a result of recognizing the 
named entity (page 211, Corpus Matching, and page 216, Results, Contiguous word
sequences were extracted from a corpus of discharge summaries and matched against 
the semantic lexicon. In addition, preference rules (weight) are applied to the corpus, 
which the system used to determine the semantic type). 



Claim Rejections - 35 USC § 103 

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

Claim 13 is rejected under 35 U.S.C. 103(a) as being unpatentable over
Johnson in view of Veale (U.S. Patent No. 6,584,470).



11. As per claim 13, Johnson discloses the method of claim 12. However, Johnson
does not disclose that, in the step (c-4), existing rules suitable to determine the
candidate named entity are extracted by comparing the rule used to
determine the candidate named entity with the rules stored in the rule database in
manners of exact match, partial match and nested match. Veale discloses a system for
named entity extraction for answering natural language questions (Abstract). In Veale, a 
four-pass search is performed where each pass performs a matching algorithm with 
different degrees of broadness. The first pass determines an exact match, passes two 
and three use synonym information to determine exact and partial matches, while pass 
four determines partial matches (column 20, lines 13-30).

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to compare the rule used to determine the candidate named entity with 
the rules stored in the rule database in manners of exact match, partial match and 
nested match in Johnson, since it would enable the system to utilize different elements 
of lexical knowledge for each match and allow the user to control a trade-off between
system accuracy and real-time performance. 

Allowable Subject Matter 

12. Claims 6-8 are objected to as being dependent upon a rejected base claim, but 
would be allowable if rewritten in independent form including all of the limitations of the 
base claim and any intervening claims. 

Conclusion 

The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. Please see the PTO-892 form.

Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Dorothy Sarah Siedler, whose telephone number is
571-270-1067. The examiner can normally be reached Mon-Thur, 9:30am-5:30pm.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Richemond Dorvil, can be reached on 571-272-7602. The fax phone
number for the organization where this application or proceeding is assigned is
571-273-8300.

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 